Overview

An abnormality on a head computed tomography (HCT) scan is commonly used in clinical studies as an indication that a traumatic brain injury (TBI) occurred. However, the diagnostic utility of an HCT abnormality for TBI is not well understood. This project aims to identify whether an HCT abnormality is useful as a prognostic biomarker or whether other factors are more relevant for prognosis.

Introduction

There are over 2 million occurrences of traumatic brain injury (TBI) in the United States every year. Among those cases, there is currently no means of predicting who will recover and who will develop chronic, long-lasting symptoms. Because it is not possible to determine who will recover without intervention, clinical research studies are sensitive to patient selection. The inclusion and exclusion criteria selected for a TBI study are crucial to its success or failure. A common inclusion criterion for TBI studies is the presence of a tissue abnormality on a head computed tomography (HCT) scan. It is used because it ensures that the underlying brain tissue was damaged by the impact. However, the diagnostic utility of HCT abnormalities has not been well characterized. This project aims to determine the diagnostic and prognostic utility of an HCT abnormality for TBI.

The analysis will not only focus on HCT abnormalities as a biomarker for TBI but will also investigate socioeconomic status and age as factors contributing to TBI outcome. The findings will inform the medical field about factors that make an individual susceptible to poor outcome after a TBI. This project also takes a critical look at clinical research study design and at whether more informed decisions about subject selection are needed for future studies. Initial study design and subject selection criteria are important for the success of clinical research studies as well as epidemiological studies of secondary datasets.

Methods

The proposed project uses a cohort of 99 subjects recruited at Penn Presbyterian Medical Center. The subjects were recruited within 72 hours of their injury and then followed up at 2 weeks, 3 months, and 6 months. The dataset includes basic demographic information as well as neuropsychiatric tests for symptomatology and functionality. Linear regression will be used to determine whether a positive head CT is associated with long-lasting symptoms or outcomes. The symptoms and outcomes being tested will be drawn from the standardized tests Rivermead symptoms inventory, BSI, GOSE, and Insomnia severity scale.
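As a reading aid for the regression output in the Results, the following minimal sketch shows what a linear model with a two-level HCT predictor estimates: the intercept is the mean outcome of the reference group (HCT positive) and the `hctnegative` coefficient is the difference in group means. The data frame `toy` and its `outcome` column are made-up illustrations, not values from the study dataset.

```r
# Toy data (hypothetical values, not from the study)
toy <- data.frame(
  hct = factor(c("positive", "positive", "positive", "negative", "negative"),
               levels = c("positive", "negative")),
  outcome = c(5, 6, 7, 6, 8)
)

# Linear model with HCT status as the only predictor
fit <- lm(outcome ~ hct, data = toy)

# Group means computed directly, for comparison with the coefficients
group_means <- tapply(toy$outcome, toy$hct, mean)
mean_difference <- unname(group_means["negative"] - group_means["positive"])

# Intercept = mean of the reference level ("positive");
# hctnegative = mean("negative") - mean("positive")
print(coef(fit))
print(mean_difference)
```

With a single binary predictor, the t-test on the `hctnegative` coefficient is equivalent to an equal-variance two-sample t-test between the groups.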

Data <- read.csv("/Users/margalithaber/Desktop/Data_12.4.2017.csv", header=TRUE)
#or for dummy data 
#Data <- read.csv(url("https://raw.githubusercontent.com/HaberM/BMIN503_Final_Project/master/DummyData.csv"), header=TRUE)

library(plyr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:plyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise, summarize
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following object is masked from 'package:plyr':
## 
##     here
## The following object is masked from 'package:base':
## 
##     date
library(ggplot2)

Data.clean <- Data %>%
  mutate(gender=factor(sex_M, levels=c(0, 1, 2), labels=c(NA, "male", "female"))) %>%
  mutate(hct=factor(hct, levels=c(1, 2), labels=c("positive", "negative"))) %>%
  mutate(race=factor(race, levels=c(0, 1, 2, 3, 4, 5, 6), labels=c(NA, "Indian", "AlaskanNative.Inuit", "Asian", "NativeHawaiian.PacificIslander", "Black", "White"))) %>%
  mutate(severity=recode(gcs, '1'=3, '2'=3, '3'=3, '4'=3, '5'=3, '6'=3, '7'=3, '8'=3 , '9'=2, '10'=2, '11'=2, '12'=2 , '13'=1, '14'=1, '15'=1)) %>%
  mutate(severity=factor(severity, levels=c(1, 2, 3), labels=c("Mild", "Moderate","Severe"))) %>%
  mutate(alcohol=factor(alcohol, levels=c(1, 2), labels=c("Sober", "Drunk"))) %>%
  filter(group == 1) %>%
  filter(!is.na(age)) %>%
  filter(!is.na(hct)) %>%
  mutate(education_years=as.integer(as.character(education_years))) %>%
  mutate(Month=month(as.POSIXlt(injury_date, format="%m/%d/%Y")))
## Warning in eval(substitute(expr), envir, enclos): NAs introduced by coercion

Results

#demographics of study
nrow(Data.clean)
## [1] 79
summary(Data.clean$age)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   18.00   34.50   54.00   52.04   67.00   93.00
summary(Data.clean$education_years)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    9.00   12.00   12.00   13.35   15.00   20.00      30
summary(Data.clean$gcs)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    3.00   14.00   15.00   13.09   15.00   15.00       3
Pos.HCT <- subset(Data.clean,hct=='positive')
Neg.HCT <- subset(Data.clean,hct=='negative')
nrow(Pos.HCT)
## [1] 63
summary(Pos.HCT$age)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    18.0    37.0    56.0    53.6    69.0    93.0
summary(Pos.HCT$education_years)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##     9.0    12.0    12.0    13.3    15.0    20.0      20
summary(Pos.HCT$gcs)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    3.00   13.00   15.00   13.05   15.00   15.00       2
nrow(Neg.HCT)
## [1] 16
summary(Neg.HCT$age)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   26.00   31.50   42.00   45.88   56.50   75.00
summary(Neg.HCT$education_years)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   11.00   12.00   12.50   13.67   13.75   20.00      10
summary(Neg.HCT$gcs)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    3.00   14.50   15.00   13.27   15.00   15.00       1
#checking for age and education matching
#difference in age between positive and negative hct groups
ggplot(Data.clean,aes(age)) + 
    geom_histogram(data=subset(Data.clean,hct == 'positive'), aes(y=..density..), fill = "red", alpha = 0.4, bins = 20) +
    geom_histogram(data=subset(Data.clean,hct == 'negative'), aes(y=..density..), fill = "blue", alpha = 0.4, bins = 20) +
  labs(title="HCT+/- by Age") +
  theme_bw()

#difference in education level between positive and negative hct groups
ggplot(Data.clean,aes(education_years)) + 
    geom_histogram(data=subset(Data.clean,hct == 'positive'), aes(y=..density..), fill = "red", alpha = 0.4, bins = 10) +
    geom_histogram(data=subset(Data.clean,hct == 'negative'), aes(y=..density..), fill = "blue", alpha = 0.4, bins = 10) +
  labs(title="HCT+/- by Years of Education") +
  theme_bw()
## Warning: Removed 20 rows containing non-finite values (stat_bin).
## Warning: Removed 10 rows containing non-finite values (stat_bin).

#difference in GCS at ED presentation between positive and negative hct groups
ggplot(Data.clean,aes(gcs)) + 
    geom_histogram(data=subset(Data.clean,hct == 'positive'), aes(y=..density..), fill = "red", alpha = 0.4, bins = 10) +
    geom_histogram(data=subset(Data.clean,hct == 'negative'), aes(y=..density..), fill = "blue", alpha = 0.4, bins = 10) +
  labs(title="HCT+/- by GCS") +
  theme_bw()
## Warning: Removed 2 rows containing non-finite values (stat_bin).
## Warning: Removed 1 rows containing non-finite values (stat_bin).

The dataset consists of 79 TBI subjects with a median age of 54 (18-93) years and median education of 12 (9-20) years. The Glasgow Coma Scale (GCS) score is used to identify injury severity. Scores range from 3 (comatose) to 15 (awake and responding). The median GCS was 15 (3-15). The positive HCT group (HCT+) consisted of 63 subjects with a median age of 56 (18-93) years and a median of 12 (9-20) years of education. The median GCS for the HCT+ group was 15 (3-15). The negative HCT group (HCT-) consisted of 16 subjects and was younger, with a median age of 42 (26-75) years. The median GCS and years of education for the HCT- group were comparable to those of the HCT+ group.
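The per-group figures quoted above (n, median, and range) can also be produced in a single table rather than with repeated `summary()` calls. The sketch below uses base R on a made-up data frame `toy`; the helper name `summarise_group` is hypothetical and the values are illustrative only.

```r
# Toy stand-in for Data.clean (hypothetical values)
toy <- data.frame(
  hct = factor(c("positive", "positive", "positive", "negative", "negative"),
               levels = c("positive", "negative")),
  age = c(18, 56, 93, 26, 75)
)

# Hypothetical helper: count of non-missing values, median, and range
summarise_group <- function(x) {
  c(n      = sum(!is.na(x)),
    median = median(x, na.rm = TRUE),
    min    = min(x, na.rm = TRUE),
    max    = max(x, na.rm = TRUE))
}

# One row per HCT group, one column per statistic
age_by_hct <- t(sapply(split(toy$age, toy$hct), summarise_group))
print(age_by_hct)
```

The same helper could be applied to `education_years` or `gcs`; because it drops missing values, the `n` column also shows how many subjects contribute to each statistic.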

ggplot(data=Data.clean, aes(gender)) + 
      geom_bar(fill="black") +
      labs(title="Demographics by gender")+
      labs(x="Gender")+
      theme_bw()

ggplot(data=Data.clean, aes(x=race, fill=factor(gender))) +
      geom_bar(position="stack") +
      labs(title="Demographics by race") +
      labs(x="Race")+
      theme_bw()

ggplot(data=Data.clean, aes(x=alcohol, fill=factor(gender))) +
      geom_bar(position="dodge") +
      labs(title="Sobriety at ED presentation") +
      labs(x="Alcohol") +
      theme_bw()

ggplot(data=Data.clean, aes(age, gcs, color=gender)) + 
    geom_point() + 
    labs(title="GCS at ED presentation across age") +
    labs(x="Age", y="GCS") + #Adds a layer with labels
    theme_bw()
## Warning: Removed 3 rows containing missing values (geom_point).

There are proportionally more males enrolled in the study than females, which is consistent with previous literature regarding the incidence of TBI. White individuals were the most represented racial group in the population. Inebriation at emergency department presentation for a TBI was more frequent in men than in women. The majority of subjects were sober at emergency department presentation. The majority of subjects in the dataset had a GCS of 15.

Severity of TBI is currently classified by Glasgow Coma Scale (GCS) scores. TBI severity is generally broken down into 3 categories: Mild (GCS 13-15), Moderate (GCS 9-12), and Severe (GCS 3-8).
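The severity binning described here can be written compactly with `cut()` instead of listing every score. This is a sketch of an equivalent mapping, not the code used in the analysis; the function name `gcs_to_severity` is hypothetical, and scores of 8 or below are treated as Severe.

```r
# Hypothetical helper: map GCS scores to the three severity categories
gcs_to_severity <- function(gcs) {
  binned <- cut(gcs,
                breaks = c(0, 8, 12, 15),   # intervals (0,8], (8,12], (12,15]
                labels = c("Severe", "Moderate", "Mild"))
  # Reorder the levels to match the Mild < Moderate < Severe order used in the plots
  factor(binned, levels = c("Mild", "Moderate", "Severe"))
}

# Boundary scores and a missing value
print(gcs_to_severity(c(3, 8, 9, 12, 13, 15, NA)))
```

Missing GCS values stay missing, so subjects without a recorded score are not silently assigned a category.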

ggplot(data=Data.clean, aes(severity, age)) +
    geom_violin(fill="lavender", draw_quantiles = c(0.25, 0.5, 0.75)) +
    geom_jitter(height=0, width=0.1) +
    labs(title="Severity of injury at ED presentation") +
    xlab("Severity (GCS)") +
    ylab("Age") +
    theme_bw() #violin plot indicating age distribution across injury severity 

The population in this study showed a high proportion of Mildly injured subjects (primarily gcs=15). The age of subjects was well distributed over the Mild and Severe injury categories. There are few subjects falling within the Moderate injury category.

ggplot(data=Data.clean, aes(x=severity, fill=factor(hct))) +
      geom_bar(position="stack") +
      labs(title="HCT+/- by severity") +
      theme_bw()

HCT+ and HCT- subjects were found in both the Mild and Severe injury groups.

#Is a head ct abnormality related to functionality 2wks-1month after injury?
ggplot(data=Data.clean, aes(x=hct, y=gose_score_1mo)) +
    geom_boxplot(fill="lavender") +
    labs(title="HCT and GOSE: 2wk-1mo") +
    theme_bw()
## Warning: Removed 40 rows containing non-finite values (stat_boxplot).

summary((lm(gose_score_1mo~hct, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_1mo ~ hct, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.8108 -1.8108  0.1892  1.8446  2.1892 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   5.8108     0.3107  18.701   <2e-16 ***
## hctnegative  -0.3108     1.3721  -0.227    0.822    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.89 on 37 degrees of freedom
##   (40 observations deleted due to missingness)
## Multiple R-squared:  0.001385,   Adjusted R-squared:  -0.0256 
## F-statistic: 0.05131 on 1 and 37 DF,  p-value: 0.822
table(Data.clean$gose_score_3mo, Data.clean$hct)
##    
##     positive negative
##   4        1        0
##   5        4        0
##   6        4        0
##   7        7        0
##   8        5        1
table(Data.clean$gose_score_6mo, Data.clean$hct)
##    
##     positive negative
##   4        2        0
##   5        1        0
##   6        5        1
##   7        5        0
##   8        2        0
#not enough subjects in HCT- group for 3 month and 6 month followup visits to analyze. 

#Is a head ct abnormality related to symptoms 2wks-1month after injury?
ggplot(data=Data.clean, aes(x=hct, y=rpq3_1mo)) +
    geom_boxplot(fill="lavender") +
    labs(title="HCT and RPQ3: 2wk-1mo") +
    theme_bw()
## Warning: Removed 37 rows containing non-finite values (stat_boxplot).

summary((lm(rpq3_1mo~hct, data=Data.clean)))
## 
## Call:
## lm(formula = rpq3_1mo ~ hct, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5000 -2.3421 -0.3421  1.6579  5.6579 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.3421     0.4289   7.793 1.51e-09 ***
## hctnegative   0.1579     1.3896   0.114     0.91    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.644 on 40 degrees of freedom
##   (37 observations deleted due to missingness)
## Multiple R-squared:  0.0003226,  Adjusted R-squared:  -0.02467 
## F-statistic: 0.01291 on 1 and 40 DF,  p-value: 0.9101
#not significant

ggplot(data=Data.clean, aes(x=hct, y=rpq13_1mo)) +
    geom_boxplot(fill="lavender") +
    labs(title="HCT and RPQ13: 2wk-1mo") +
    theme_bw()
## Warning: Removed 37 rows containing non-finite values (stat_boxplot).

summary((lm(rpq13_1mo~hct, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_1mo ~ hct, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.7368  -6.4868   0.0066   5.2632  30.2632 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   13.737      1.431   9.597 6.24e-12 ***
## hctnegative   -1.487      4.638  -0.321     0.75    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.823 on 40 degrees of freedom
##   (37 observations deleted due to missingness)
## Multiple R-squared:  0.002563,   Adjusted R-squared:  -0.02237 
## F-statistic: 0.1028 on 1 and 40 DF,  p-value: 0.7502
#not significant

#Is a head ct abnormality related to sleep disturbances after injury?
ggplot(data=Data.clean, aes(x=hct, y=isi_falling_1mo)) +
    geom_boxplot(fill="lavender") +
    labs(title="HCT and Falling Asleep: 2wk-1mo") +
    theme_bw()
## Warning: Removed 37 rows containing non-finite values (stat_boxplot).

summary((glm(isi_falling_1mo~hct, data=Data.clean)))
## 
## Call:
## glm(formula = isi_falling_1mo ~ hct, data = Data.clean)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.7500  -1.1842  -0.1842   0.8158   2.8158  
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2.1842     0.2153  10.145 1.27e-12 ***
## hctnegative   0.5658     0.6977   0.811    0.422    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 1.761513)
## 
##     Null deviance: 71.619  on 41  degrees of freedom
## Residual deviance: 70.461  on 40  degrees of freedom
##   (37 observations deleted due to missingness)
## AIC: 146.92
## 
## Number of Fisher Scoring iterations: 2
#not significant

ggplot(data=Data.clean, aes(x=hct, y=isi_staying_1mo)) +
    geom_boxplot(fill="lavender") +
    labs(title="HCT and Staying Asleep: 2wk-1mo") +
    theme_bw()
## Warning: Removed 37 rows containing non-finite values (stat_boxplot).

summary((lm(isi_staying_1mo~hct, data=Data.clean)))
## 
## Call:
## lm(formula = isi_staying_1mo ~ hct, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.5000 -1.2632 -0.2632  0.7368  2.7368 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   2.2632     0.2058  10.998 1.16e-13 ***
## hctnegative   0.2368     0.6668   0.355    0.724    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.269 on 40 degrees of freedom
##   (37 observations deleted due to missingness)
## Multiple R-squared:  0.003144,   Adjusted R-squared:  -0.02178 
## F-statistic: 0.1262 on 1 and 40 DF,  p-value: 0.7243
#not significant

After reviewing the number of follow-up visits per HCT group, I found that the sample size for the HCT- group was too small to compare groups at 3 months (n=3) and 6 months (n=1) after injury. Analysis therefore focused on the association between HCT abnormalities and subacute symptoms and outcome at the 2-week to 1-month time point. Linear regression models were generated using HCT as a predictor of several outcome metrics. GOSE measures how functional the subject is in daily life. RPQ is an inventory of self-reported symptoms such as depression and headaches. Two insomnia indices were selected from the insomnia severity scale because sleep disturbances are a common symptom after a head injury. There was no significant difference between the HCT+ and HCT- groups for GOSE, for the RPQ3 and RPQ13 symptom scores, or for the indices of difficulty falling asleep or staying asleep. HCT abnormalities do not appear to be associated with subacute outcome measures in this sample.
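The follow-up counts per HCT group mentioned above can be checked in one step by counting non-missing outcome scores per group at each visit. The sketch below uses a made-up data frame `toy`; the column names mirror the GOSE columns used in this report, but the values are illustrative only.

```r
# Toy stand-in for Data.clean (hypothetical values)
toy <- data.frame(
  hct = factor(c("positive", "positive", "positive", "negative", "negative"),
               levels = c("positive", "negative")),
  gose_score_1mo = c(6, 7, NA, 8, NA),
  gose_score_3mo = c(5, NA, NA, 8, NA)
)

visit_cols <- c("gose_score_1mo", "gose_score_3mo")

# Rows = HCT group, columns = visit, cells = number of subjects with a score
n_by_visit <- sapply(toy[visit_cols],
                     function(score) tapply(!is.na(score), toy$hct, sum))
print(n_by_visit)
```

A table like this makes it easy to see at which visits a between-group comparison is not feasible.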

Next I investigated whether there are any predictors for the occurrence of HCT abnormalities.

#Does the severity of the injury affect whether the injury results in a head ct abnormality?
ggplot(data=Data.clean, aes(x=hct, fill=factor(severity))) +
      geom_bar(position="dodge") +
      labs(title="Severity vs. presence of HCT abnormalities") +
      labs(x="HCT", fill="Severity (GCS)") +
      theme_bw()

summary((glm(hct~severity, data=Data.clean, family=binomial())))
## 
## Call:
## glm(formula = hct ~ severity, family = binomial(), data = Data.clean)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.6860  -0.6860  -0.6860  -0.6039   1.8930  
## 
## Coefficients:
##                   Estimate Std. Error z value Pr(>|z|)    
## (Intercept)        -1.3269     0.3120  -4.253 2.11e-05 ***
## severityModerate  -15.2392  1696.7344  -0.009    0.993    
## severitySevere     -0.2826     0.8351  -0.338    0.735    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 75.503  on 75  degrees of freedom
## Residual deviance: 74.491  on 73  degrees of freedom
##   (3 observations deleted due to missingness)
## AIC: 80.491
## 
## Number of Fisher Scoring iterations: 15
#not significant

#Is education related to whether the injury results in a head ct abnormality?
ggplot(data=Data.clean, aes(x=hct, y=education_years)) +
    labs(title="Years of Education v. HCT abnormalities") +
    labs(y= "Education (yrs)") +
    geom_boxplot(fill="lavender") +
    theme_bw()
## Warning: Removed 30 rows containing non-finite values (stat_boxplot).

summary((glm(hct~education_years, data=Data.clean, family=binomial())))
## 
## Call:
## glm(formula = hct ~ education_years, family = binomial(), data = Data.clean)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.6007  -0.5178  -0.4924  -0.4802   2.1059  
## 
## Coefficients:
##                 Estimate Std. Error z value Pr(>|z|)
## (Intercept)     -2.69043    2.27008  -1.185    0.236
## education_years  0.05348    0.16361   0.327    0.744
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 36.434  on 48  degrees of freedom
## Residual deviance: 36.330  on 47  degrees of freedom
##   (30 observations deleted due to missingness)
## AIC: 40.33
## 
## Number of Fisher Scoring iterations: 4
#Does age affect whether the injury results in a head ct abnormality?
ggplot(data=Data.clean, aes(x=hct, y=age)) +
    geom_boxplot(fill="lavender") +
    labs(title="Age vs. presence of HCT abnormalities") +
    theme_bw()

summary((glm(hct~age, data=Data.clean, family=binomial())))
## 
## Call:
## glm(formula = hct ~ age, family = binomial(), data = Data.clean)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.8751  -0.7184  -0.6153  -0.4970   2.0008  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.41678    0.73726  -0.565    0.572
## age         -0.01919    0.01436  -1.337    0.181
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 79.615  on 78  degrees of freedom
## Residual deviance: 77.751  on 77  degrees of freedom
## AIC: 81.751
## 
## Number of Fisher Scoring iterations: 4
#slight trend

#Does sobriety affect whether the injury results in a head ct abnormality?
ggplot(data=Data.clean, aes(x=alcohol, fill=factor(hct))) +
      geom_bar(position="fill") +
      labs(title="Sobriety v. HCT abnormalities") +
      labs(x="Alcohol") +
      theme_bw()

summary((glm(hct~alcohol, data=Data.clean, family=binomial())))
## 
## Call:
## glm(formula = hct ~ alcohol, family = binomial(), data = Data.clean)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.8383  -0.8383  -0.5415  -0.5415   1.9962  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   -1.8458     0.4393  -4.202 2.65e-05 ***
## alcoholDrunk   0.9808     0.6088   1.611    0.107    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 70.499  on 70  degrees of freedom
## Residual deviance: 67.866  on 69  degrees of freedom
##   (8 observations deleted due to missingness)
## AIC: 71.866
## 
## Number of Fisher Scoring iterations: 4
#stronger trend

#accounting for age as a covariate for the effect of sobriety on occurrence of a HCT+
summary((glm(hct~alcohol+age, data=Data.clean, family=binomial())))
## 
## Call:
## glm(formula = hct ~ alcohol + age, family = binomial(), data = Data.clean)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.8582  -0.8149  -0.5418  -0.5258   2.0122  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  -1.708799   1.032154  -1.656   0.0978 .
## alcoholDrunk  0.944950   0.655423   1.442   0.1494  
## age          -0.002383   0.016341  -0.146   0.8841  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 70.499  on 70  degrees of freedom
## Residual deviance: 67.845  on 68  degrees of freedom
##   (8 observations deleted due to missingness)
## AIC: 73.845
## 
## Number of Fisher Scoring iterations: 4
#p-value for alcohol increased only slightly (0.107 to 0.149)

Severity and years of education were not significantly associated with HCT+/-. Age showed a slight trend (p=0.18). Sobriety showed a stronger trend (p=0.11), suggesting that subjects who were drunk at emergency department presentation were more likely to have no HCT abnormality than those who were sober. When age is included in the model for sobriety, the p-value for sobriety increases minimally to 0.15. None of these predictors reached statistical significance.
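Two details help when reading the logistic regression output above. First, for a factor response `glm(..., family=binomial())` treats the first level as failure and models the probability of the other level, so with levels `positive`, `negative` the coefficients describe the odds of a negative HCT. Second, exponentiating a coefficient gives an odds ratio. The sketch below illustrates both points on made-up counts in `toy` (hypothetical data) and then converts the `alcoholDrunk` estimate reported above.

```r
# Toy data (hypothetical counts): Sober = 9 positive, 1 negative;
# Drunk = 7 positive, 3 negative
toy <- data.frame(
  hct = factor(rep(c("positive", "negative", "positive", "negative"),
                   times = c(9, 1, 7, 3)),
               levels = c("positive", "negative")),
  alcohol = factor(rep(c("Sober", "Drunk"), times = c(10, 10)),
                   levels = c("Sober", "Drunk"))
)

fit <- glm(hct ~ alcohol, data = toy, family = binomial())

# The intercept back-transforms to the proportion of "negative" among Sober
# subjects, confirming that "negative" is the modeled event
p_negative_sober <- unname(plogis(coef(fit)["(Intercept)"]))

# Odds ratio of a negative HCT for Drunk vs. Sober in the toy data
odds_ratio_drunk <- unname(exp(coef(fit)["alcoholDrunk"]))

# Odds ratio implied by the alcoholDrunk estimate (0.9808) reported above
or_reported <- exp(0.9808)

print(c(p_negative_sober = p_negative_sober,
        odds_ratio_drunk = odds_ratio_drunk,
        or_reported = or_reported))
```

The reported estimate corresponds to an odds ratio of about 2.67 for a negative HCT among inebriated subjects, although the p-value indicates that this estimate is not statistically distinguishable from 1.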

#Does age affect injury outcome?
ggplot(data=Data.clean, aes(age, gose_score_1mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Age: 2wk-1mo") +
    labs(y="GOSE", x="Age") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 40 rows containing non-finite values (stat_smooth).
## Warning: Removed 40 rows containing missing values (geom_point).

summary((lm(gose_score_1mo~age, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_1mo ~ age, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.1325 -1.7697  0.5406  1.6041  2.5090 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  5.23860    0.79900   6.556 1.11e-07 ***
## age          0.01052    0.01400   0.751    0.457    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.877 on 37 degrees of freedom
##   (40 observations deleted due to missingness)
## Multiple R-squared:  0.01503,    Adjusted R-squared:  -0.01159 
## F-statistic: 0.5646 on 1 and 37 DF,  p-value: 0.4572
#not significant

ggplot(data=Data.clean, aes(age, gose_score_3mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Age: 3mo") +
    labs(y="GOSE", x="Age") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 57 rows containing non-finite values (stat_smooth).
## Warning: Removed 57 rows containing missing values (geom_point).

summary((lm(gose_score_3mo~age, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_3mo ~ age, data = Data.clean)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -2.734 -0.800  0.266  0.983  1.618 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  6.04059    0.66177   9.128 1.43e-08 ***
## age          0.01101    0.01216   0.905    0.376    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.226 on 20 degrees of freedom
##   (57 observations deleted due to missingness)
## Multiple R-squared:  0.03936,    Adjusted R-squared:  -0.008671 
## F-statistic: 0.8195 on 1 and 20 DF,  p-value: 0.3761
#not significant

ggplot(data=Data.clean, aes(age, gose_score_6mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Age: 6mo") +
    labs(y="GOSE", x="Age") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 63 rows containing non-finite values (stat_smooth).
## Warning: Removed 63 rows containing missing values (geom_point).

summary((lm(gose_score_6mo~age, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_6mo ~ age, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.90245 -0.42831 -0.04751  0.69258  1.87995 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  6.93000    0.77897   8.896  3.9e-07 ***
## age         -0.01209    0.01280  -0.944    0.361    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.188 on 14 degrees of freedom
##   (63 observations deleted due to missingness)
## Multiple R-squared:  0.05987,    Adjusted R-squared:  -0.007284 
## F-statistic: 0.8915 on 1 and 14 DF,  p-value: 0.3611
#not significant

ggplot(data=Data.clean, aes(age, rpq13_1mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Age: 2wk-1mo") +
    labs(y="RPQ13", x="Age") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 37 rows containing non-finite values (stat_smooth).
## Warning: Removed 37 rows containing missing values (geom_point).

summary((lm(rpq13_1mo~age, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_1mo ~ age, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -17.4092  -5.9231   0.1803   3.4925  26.1201 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 22.27287    3.35867   6.631 6.13e-08 ***
## age         -0.15689    0.05638  -2.783  0.00819 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.087 on 40 degrees of freedom
##   (37 observations deleted due to missingness)
## Multiple R-squared:  0.1622, Adjusted R-squared:  0.1413 
## F-statistic: 7.744 on 1 and 40 DF,  p-value: 0.008186
#significant with RPQ13

summary((lm(rpq13_1mo~age+gcs, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_1mo ~ age + gcs, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -17.5326  -5.3146   0.0522   3.6783  25.8965 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 22.38885    5.43705   4.118 0.000199 ***
## age         -0.15049    0.06427  -2.342 0.024545 *  
## gcs         -0.02389    0.43337  -0.055 0.956324    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.234 on 38 degrees of freedom
##   (38 observations deleted due to missingness)
## Multiple R-squared:  0.1543, Adjusted R-squared:  0.1097 
## F-statistic: 3.466 on 2 and 38 DF,  p-value: 0.04145
summary((lm(rpq13_1mo~age+severity, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_1mo ~ age + severity, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -16.0590  -4.1629  -0.1711   3.1748  27.3978 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      24.30328    4.22396   5.754 1.35e-06 ***
## age              -0.18108    0.06653  -2.722  0.00984 ** 
## severityModerate -6.86281    8.79909  -0.780  0.44038    
## severitySevere   -2.63088    4.06049  -0.648  0.52104    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.249 on 37 degrees of freedom
##   (38 observations deleted due to missingness)
## Multiple R-squared:  0.1736, Adjusted R-squared:  0.1065 
## F-statistic:  2.59 on 3 and 37 DF,  p-value: 0.06739
#age is still significant when adding severity or GCS to the model

ggplot(data=Data.clean, aes(age, rpq13_3mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Age: 3mo") +
    labs(y="RPQ13", x="Age") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 58 rows containing non-finite values (stat_smooth).
## Warning: Removed 58 rows containing missing values (geom_point).

summary((lm(rpq13_3mo~age, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_3mo ~ age, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -12.870  -6.767  -1.769   5.231  20.933 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)  15.8750     5.4433   2.916  0.00885 **
## age          -0.1003     0.1028  -0.976  0.34140   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.906 on 19 degrees of freedom
##   (58 observations deleted due to missingness)
## Multiple R-squared:  0.04773,    Adjusted R-squared:  -0.002392 
## F-statistic: 0.9523 on 1 and 19 DF,  p-value: 0.3414
#not significant

ggplot(data=Data.clean, aes(age, rpq13_6mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Age: 6mo") +
    labs(y="RPQ13", x="Age") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 65 rows containing non-finite values (stat_smooth).
## Warning: Removed 65 rows containing missing values (geom_point).

summary((lm(rpq13_6mo~age, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_6mo ~ age, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -12.650  -5.915  -1.009   2.904  23.154 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  21.4746     7.2927   2.945   0.0123 *
## age          -0.2010     0.1146  -1.754   0.1049  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.607 on 12 degrees of freedom
##   (65 observations deleted due to missingness)
## Multiple R-squared:  0.2041, Adjusted R-squared:  0.1378 
## F-statistic: 3.078 on 1 and 12 DF,  p-value: 0.1049
#trending but not significant


#Does number of years of education affect injury outcome?
ggplot(data=Data.clean, aes(education_years, gose_score_1mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Level of Education: 2wk-1mo") +
    labs(y="GOSE", x="Education (yrs)") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 48 rows containing non-finite values (stat_smooth).
## Warning: Removed 48 rows containing missing values (geom_point).

summary((lm(gose_score_1mo~education_years, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_1mo ~ education_years, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.88774 -1.24546 -0.03403  1.18540  2.82769 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)   
## (Intercept)       0.8797     1.5342   0.573  0.57079   
## education_years   0.3577     0.1168   3.062  0.00471 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.691 on 29 degrees of freedom
##   (48 observations deleted due to missingness)
## Multiple R-squared:  0.2443, Adjusted R-squared:  0.2182 
## F-statistic: 9.373 on 1 and 29 DF,  p-value: 0.004714
#significant

summary((lm(gose_score_1mo~education_years+age+gcs, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_1mo ~ education_years + age + gcs, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.8256 -1.2028  0.1719  1.3089  2.8736 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)  
## (Intercept)      0.36080    1.62944   0.221   0.8264  
## education_years  0.33677    0.12868   2.617   0.0143 *
## age             -0.01104    0.01536  -0.719   0.4784  
## gcs              0.10866    0.08604   1.263   0.2174  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.698 on 27 degrees of freedom
##   (48 observations deleted due to missingness)
## Multiple R-squared:  0.2905, Adjusted R-squared:  0.2117 
## F-statistic: 3.685 on 3 and 27 DF,  p-value: 0.02408
#still significant when accounting for age and gcs

ggplot(data=Data.clean, aes(education_years, gose_score_3mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Level of Education: 3mo") +
    labs(y="GOSE", x="Education (yrs)") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 58 rows containing non-finite values (stat_smooth).
## Warning: Removed 58 rows containing missing values (geom_point).

summary((lm(gose_score_3mo~education_years, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_3mo ~ education_years, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4693 -0.7547  0.1014  0.5778  1.7689 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)  
## (Intercept)       3.6108     1.2998   2.778   0.0120 *
## education_years   0.2382     0.1027   2.320   0.0316 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.13 on 19 degrees of freedom
##   (58 observations deleted due to missingness)
## Multiple R-squared:  0.2207, Adjusted R-squared:  0.1797 
## F-statistic: 5.382 on 1 and 19 DF,  p-value: 0.03164
#significant

summary((lm(gose_score_3mo~education_years+age+gcs, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_3mo ~ education_years + age + gcs, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.51570 -0.87882  0.09121  0.85695  1.49378 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)  
## (Intercept)      3.905319   1.407403   2.775    0.013 *
## education_years  0.251281   0.120247   2.090    0.052 .
## age              0.005674   0.013860   0.409    0.687  
## gcs             -0.058652   0.074108  -0.791    0.440  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.172 on 17 degrees of freedom
##   (58 observations deleted due to missingness)
## Multiple R-squared:  0.2501, Adjusted R-squared:  0.1178 
## F-statistic:  1.89 on 3 and 17 DF,  p-value: 0.1696
#borderline (p=0.052) when accounting for age and gcs

ggplot(data=Data.clean, aes(education_years, gose_score_6mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Level of Education: 6mo") +
    labs(y="GOSE", x="Education (yrs)") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 64 rows containing non-finite values (stat_smooth).
## Warning: Removed 64 rows containing missing values (geom_point).

summary((lm(gose_score_6mo~education_years, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_6mo ~ education_years, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4757 -0.2144  0.0469  0.8509  1.7856 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)  
## (Intercept)       4.6466     1.7346   2.679   0.0189 *
## education_years   0.1307     0.1375   0.950   0.3595  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.227 on 13 degrees of freedom
##   (64 observations deleted due to missingness)
## Multiple R-squared:  0.06491,    Adjusted R-squared:  -0.007019 
## F-statistic: 0.9024 on 1 and 13 DF,  p-value: 0.3595
#not significant for GOSE at 6 months, may be due to small sample size

ggplot(data=Data.clean, aes(education_years, rpq13_1mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Level of Education: 2wk-1mo") +
    labs(y="RPQ13", x="Education (yrs)") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 47 rows containing non-finite values (stat_smooth).
## Warning: Removed 47 rows containing missing values (geom_point).

summary((lm(rpq13_1mo~education_years, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_1mo ~ education_years, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -15.1448  -6.4482  -0.4845   5.9087  25.5347 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)   
## (Intercept)      28.4268     8.2656   3.439  0.00174 **
## education_years  -1.1068     0.6206  -1.783  0.08464 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.119 on 30 degrees of freedom
##   (47 observations deleted due to missingness)
## Multiple R-squared:  0.09586,    Adjusted R-squared:  0.06572 
## F-statistic: 3.181 on 1 and 30 DF,  p-value: 0.08464
#trending for RPQ13

ggplot(data=Data.clean, aes(education_years, rpq13_3mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Level of Education: 3mo") +
    labs(y="RPQ13", x="Education (yrs)") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 59 rows containing non-finite values (stat_smooth).
## Warning: Removed 59 rows containing missing values (geom_point).

summary((lm(rpq13_3mo~education_years, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_3mo ~ education_years, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.5291  -4.5608  -0.0608   5.0721  14.5341 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      41.6813     9.5000   4.387 0.000355 ***
## education_years  -2.4684     0.7486  -3.297 0.004004 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.233 on 18 degrees of freedom
##   (59 observations deleted due to missingness)
## Multiple R-squared:  0.3766, Adjusted R-squared:  0.3419 
## F-statistic: 10.87 on 1 and 18 DF,  p-value: 0.004004
#highly significant

summary((lm(rpq13_3mo~education_years+age+gcs, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_3mo ~ education_years + age + gcs, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.6830  -4.4616  -0.5992   3.5559  19.1376 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)   
## (Intercept)     38.418231  10.074945   3.813  0.00153 **
## education_years -2.822446   0.876853  -3.219  0.00536 **
## age              0.001666   0.104393   0.016  0.98746   
## gcs              0.599836   0.531301   1.129  0.27554   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.383 on 16 degrees of freedom
##   (59 observations deleted due to missingness)
## Multiple R-squared:  0.4254, Adjusted R-squared:  0.3177 
## F-statistic: 3.949 on 3 and 16 DF,  p-value: 0.0277
#still significant when accounting for age and gcs

summary((lm(rpq13_6mo~education_years, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_6mo ~ education_years, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.0797  -5.3705  -1.9163   0.3386  21.0478 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)  
## (Intercept)       38.570     13.609   2.834   0.0163 *
## education_years   -2.291      1.065  -2.150   0.0546 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.363 on 11 degrees of freedom
##   (66 observations deleted due to missingness)
## Multiple R-squared:  0.2959, Adjusted R-squared:  0.2319 
## F-statistic: 4.623 on 1 and 11 DF,  p-value: 0.05464

ggplot(data=Data.clean, aes(education_years, rpq13_6mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Level of Education: 6mo") +
    labs(y="RPQ13", x="Education (yrs)") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 66 rows containing non-finite values (stat_smooth).
## Warning: Removed 66 rows containing missing values (geom_point).

#almost significant for RPQ13 at p=0.055
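
The education models above repeat the same `summary(lm(...))` call for each outcome and time point. A minimal sketch of how the slope, p-value, and number of complete cases could be collected into one table is shown below. The data frame `toy` and the helper `fit_one` are hypothetical and are not part of the original analysis; the values are simulated and only the column names mirror `Data.clean`.

```r
# Simulated stand-in for Data.clean (hypothetical values, NOT study data).
set.seed(1)
n <- 79
toy <- data.frame(
  education_years = sample(8:20, n, replace = TRUE),
  gose_score_1mo  = sample(c(1:8, NA), n, replace = TRUE),
  gose_score_3mo  = sample(c(1:8, NA, NA), n, replace = TRUE),
  gose_score_6mo  = sample(c(1:8, NA, NA, NA), n, replace = TRUE)
)

# Hypothetical helper: fit outcome ~ predictor and return one summary row.
fit_one <- function(outcome, predictor, data) {
  fit <- lm(reformulate(predictor, response = outcome), data = data)
  est <- coef(summary(fit))[predictor, ]
  data.frame(
    outcome  = outcome,
    n_used   = nobs(fit),                   # complete cases actually fitted
    estimate = unname(est["Estimate"]),     # slope for the predictor
    p_value  = unname(est["Pr(>|t|)"])
  )
}

outcomes <- c("gose_score_1mo", "gose_score_3mo", "gose_score_6mo")
results <- do.call(
  rbind,
  lapply(outcomes, fit_one, predictor = "education_years", data = toy)
)
print(results)
```

Reporting `n_used` next to each p-value makes the shrinking sample at later time points visible in the same table as the effect estimates.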

#Does race affect injury outcome?
ggplot(data=Data.clean, aes(race, gose_score_1mo)) +
    labs(title="Injury Outcome by Race: 2wk-1mo") +
    geom_boxplot(fill="lavender")+
    theme_bw()
## Warning: Removed 40 rows containing non-finite values (stat_boxplot).

summary((lm(gose_score_1mo~race, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_1mo ~ race, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.7778 -1.7407  0.2222  1.2593  2.2593 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   5.0000     1.9179   2.607   0.0133 *
## raceAsian     2.0000     2.3489   0.851   0.4003  
## raceBlack     0.7778     2.0216   0.385   0.7028  
## raceWhite     0.7407     1.9531   0.379   0.7068  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.918 on 35 degrees of freedom
##   (40 observations deleted due to missingness)
## Multiple R-squared:  0.02734,    Adjusted R-squared:  -0.05603 
## F-statistic: 0.3279 on 3 and 35 DF,  p-value: 0.8052

ggplot(data=Data.clean, aes(race, gose_score_3mo)) +
    labs(title="Injury Outcome by Race: 3mo") +
    geom_boxplot(fill="lavender")+
    theme_bw()
## Warning: Removed 57 rows containing non-finite values (stat_boxplot).

summary((lm(gose_score_3mo~race, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_3mo ~ race, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.5000 -0.6607  0.3929  1.0893  1.5000 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   7.0000     1.2759   5.486 2.71e-05 ***
## raceBlack    -0.2857     1.3640  -0.209    0.836    
## raceWhite    -0.5000     1.3206  -0.379    0.709    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.276 on 19 degrees of freedom
##   (57 observations deleted due to missingness)
## Multiple R-squared:  0.01244,    Adjusted R-squared:  -0.09151 
## F-statistic: 0.1197 on 2 and 19 DF,  p-value: 0.8879

ggplot(data=Data.clean, aes(race, gose_score_6mo)) +
    labs(title="Injury Outcome by Race: 6mo") +
    geom_boxplot(fill="lavender") + 
    theme_bw()
## Warning: Removed 63 rows containing non-finite values (stat_boxplot).

summary((lm(gose_score_6mo~race, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_6mo ~ race, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4167 -0.4167  0.2500  0.5833  1.5833 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   5.7500     0.5926   9.703 1.36e-07 ***
## raceWhite     0.6667     0.6843   0.974    0.346    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.185 on 14 degrees of freedom
##   (63 observations deleted due to missingness)
## Multiple R-squared:  0.06349,    Adjusted R-squared:  -0.003401 
## F-statistic: 0.9492 on 1 and 14 DF,  p-value: 0.3465
#not significant at any time point

ggplot(data=Data.clean, aes(race, gose_score_1mo)) +
    labs(title="Injury Outcome by Race: 2wk-1mo") +
    geom_boxplot(fill="lavender")+
    theme_bw()
## Warning: Removed 40 rows containing non-finite values (stat_boxplot).

summary((lm(rpq13_1mo~race, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_1mo ~ race, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.1071  -5.6071  -0.1071   4.9050  29.8929 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)    5.000      8.941   0.559    0.579
## raceAsian      8.500     10.951   0.776    0.442
## raceBlack      8.091      9.339   0.866    0.392
## raceWhite      9.107      9.100   1.001    0.323
## 
## Residual standard error: 8.941 on 38 degrees of freedom
##   (37 observations deleted due to missingness)
## Multiple R-squared:  0.02691,    Adjusted R-squared:  -0.04991 
## F-statistic: 0.3504 on 3 and 38 DF,  p-value: 0.7891

ggplot(data=Data.clean, aes(race, gose_score_3mo)) +
    labs(title="Injury Outcome by Race: 3mo") +
    geom_boxplot(fill="lavender")+
    theme_bw()
## Warning: Removed 57 rows containing non-finite values (stat_boxplot).

summary((lm(rpq13_3mo~race, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_3mo ~ race, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -14.231  -4.857   0.000   4.769  19.769 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   12.000      9.301   1.290    0.213
## raceBlack     -7.143      9.943  -0.718    0.482
## raceWhite      2.231      9.652   0.231    0.820
## 
## Residual standard error: 9.301 on 18 degrees of freedom
##   (58 observations deleted due to missingness)
## Multiple R-squared:  0.2047, Adjusted R-squared:  0.1164 
## F-statistic: 2.317 on 2 and 18 DF,  p-value: 0.1273

ggplot(data=Data.clean, aes(race, gose_score_6mo)) +
    labs(title="Injury Outcome by Race: 6mo") +
    geom_boxplot(fill="lavender") + 
    theme_bw()
## Warning: Removed 63 rows containing non-finite values (stat_boxplot).

summary((lm(rpq13_6mo~race, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_6mo ~ race, data = Data.clean)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.333 -6.125 -2.833  1.917 29.667 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   10.500      7.608   1.380    0.193
## raceWhite     -1.167      8.218  -0.142    0.889
## 
## Residual standard error: 10.76 on 12 degrees of freedom
##   (65 observations deleted due to missingness)
## Multiple R-squared:  0.001677,   Adjusted R-squared:  -0.08152 
## F-statistic: 0.02016 on 1 and 12 DF,  p-value: 0.8895
#not significant at any time point
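
In the race models above, each `race...` coefficient compares one group against the reference level that is absorbed into the intercept, while the F-statistic line answers the overall question of whether race explains any variation in outcome. The sketch below shows how the reference level can be inspected and changed, and how the overall test can be pulled out directly. It uses simulated data; the level "Other" is made up for illustration because the reference level in the output above is not shown.

```r
# Simulated stand-in for Data.clean (hypothetical values, NOT study data).
# "Other" is a made-up level used only for illustration.
set.seed(2)
toy <- data.frame(
  race = factor(rep(c("Asian", "Black", "Other", "White"), times = c(5, 15, 5, 35)))
)
toy$gose <- sample(1:8, nrow(toy), replace = TRUE)

# The first factor level is the reference absorbed into the intercept.
print(levels(toy$race)[1])

# Optionally make the largest group the reference level.
toy$race <- relevel(toy$race, ref = "White")

fit <- lm(gose ~ race, data = toy)

# One overall F-test for race, instead of reading each per-level t-test.
overall <- anova(fit)
overall_p <- overall[["Pr(>F)"]][1]
print(overall)
```

For a model with a single factor, this p-value is the same one printed on the F-statistic line of `summary(fit)`.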

#Does the time of year in which the injury occurred affect outcome?
ggplot(data=Data.clean, aes(Month, gose_score_1mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Time of Year: 2wk-1mo") +
    labs(y="GOSE", x="Time of Year (month)") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 41 rows containing non-finite values (stat_smooth).
## Warning: Removed 41 rows containing missing values (geom_point).

summary((lm(gose_score_1mo~Month, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_1mo ~ Month, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.98908 -1.40938  0.05678  1.07971  3.14627 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   4.6267     0.5310   8.713 2.15e-10 ***
## Month         0.2271     0.0857   2.650   0.0119 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.75 on 36 degrees of freedom
##   (41 observations deleted due to missingness)
## Multiple R-squared:  0.1632, Adjusted R-squared:  0.1399 
## F-statistic: 7.021 on 1 and 36 DF,  p-value: 0.01189
#significant with GOSE at 2wk-1mo

summary((lm(gose_score_1mo~Month+gcs+age, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_1mo ~ Month + gcs + age, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -3.11920 -1.44143 -0.00481  1.18570  2.71770 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 2.862283   1.081491   2.647   0.0122 *
## Month       0.191159   0.087440   2.186   0.0358 *
## gcs         0.135423   0.083511   1.622   0.1141  
## age         0.004137   0.014074   0.294   0.7706  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.71 on 34 degrees of freedom
##   (41 observations deleted due to missingness)
## Multiple R-squared:  0.2453, Adjusted R-squared:  0.1787 
## F-statistic: 3.683 on 3 and 34 DF,  p-value: 0.02126
#still significant when accounting for age and gcs

ggplot(data=Data.clean, aes(Month, gose_score_3mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Time of Year: 3mo") +
    labs(y="GOSE", x="Time of Year (month)") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 57 rows containing non-finite values (stat_smooth).
## Warning: Removed 57 rows containing missing values (geom_point).

summary((lm(gose_score_3mo~Month, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_3mo ~ Month, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.5812 -0.7415  0.2051  0.9818  1.6325 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   6.1538     0.6860    8.97  1.9e-08 ***
## Month         0.1068     0.1548    0.69    0.498    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.237 on 20 degrees of freedom
##   (57 observations deleted due to missingness)
## Multiple R-squared:  0.02326,    Adjusted R-squared:  -0.02558 
## F-statistic: 0.4763 on 1 and 20 DF,  p-value: 0.4981
#not significant

ggplot(data=Data.clean, aes(Month, gose_score_6mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Time of Year: 6mo") +
    labs(y="GOSE", x="Time of Year (month)") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 63 rows containing non-finite values (stat_smooth).
## Warning: Removed 63 rows containing missing values (geom_point).

summary((lm(gose_score_6mo~Month, data=Data.clean)))
## 
## Call:
## lm(formula = gose_score_6mo ~ Month, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.2343 -0.5481 -0.2343  0.8285  1.7657 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   6.9874     0.9577   7.296 3.93e-06 ***
## Month        -0.2510     0.3097  -0.811    0.431    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.197 on 14 degrees of freedom
##   (63 observations deleted due to missingness)
## Multiple R-squared:  0.04483,    Adjusted R-squared:  -0.0234 
## F-statistic: 0.6571 on 1 and 14 DF,  p-value: 0.4312
#not significant

ggplot(data=Data.clean, aes(Month, rpq13_1mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Time of Year: 2wk-1mo") +
    labs(y="RPQ13", x="Time of Year (month)") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 38 rows containing non-finite values (stat_smooth).
## Warning: Removed 38 rows containing missing values (geom_point).

summary((lm(rpq13_1mo~Month, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_1mo ~ Month, data = Data.clean)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.4363  -6.4073   0.2062   5.2207  29.2497 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  15.7358     2.6499   5.938 6.31e-07 ***
## Month        -0.3285     0.3861  -0.851      0.4    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.757 on 39 degrees of freedom
##   (38 observations deleted due to missingness)
## Multiple R-squared:  0.01822,    Adjusted R-squared:  -0.006953 
## F-statistic: 0.7238 on 1 and 39 DF,  p-value: 0.4001

ggplot(data=Data.clean, aes(Month, rpq13_3mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Time of Year: 3mo") +
    labs(y="RPQ13", x="Time of Year (month)") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 58 rows containing non-finite values (stat_smooth).
## Warning: Removed 58 rows containing missing values (geom_point).

summary((lm(rpq13_3mo~Month, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_3mo ~ Month, data = Data.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -10.966  -9.206  -0.726   6.274  23.274 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)  10.0068     5.7544   1.739   0.0982 .
## Month         0.2397     1.2822   0.187   0.8537  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.14 on 19 degrees of freedom
##   (58 observations deleted due to missingness)
## Multiple R-squared:  0.001837,   Adjusted R-squared:  -0.0507 
## F-statistic: 0.03496 on 1 and 19 DF,  p-value: 0.8537

ggplot(data=Data.clean, aes(Month, rpq13_6mo, color=severity)) +
    geom_point() + 
    labs(title="Injury Outcome by Time of Year: 6mo") +
    labs(y="RPQ13", x="Time of Year (month)") + 
    geom_smooth(color="black", method="lm") +
    theme_bw()
## Warning: Removed 65 rows containing non-finite values (stat_smooth).
## Warning: Removed 65 rows containing missing values (geom_point).

summary((lm(rpq13_6mo~Month, data=Data.clean)))
## 
## Call:
## lm(formula = rpq13_6mo ~ Month, data = Data.clean)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -9.895 -6.007 -2.970  1.461 29.530 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  10.7459     9.2230   1.165    0.267
## Month        -0.4254     2.9923  -0.142    0.889
## 
## Residual standard error: 10.76 on 12 degrees of freedom
##   (65 observations deleted due to missingness)
## Multiple R-squared:  0.001681,   Adjusted R-squared:  -0.08151 
## F-statistic: 0.02021 on 1 and 12 DF,  p-value: 0.8893
#not significant at any time point
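
The models above enter `Month` as a single linear term, which places December and January at opposite ends of the scale even though they are adjacent in the calendar. One possible alternative, not used in the original analysis, is to encode month with a sine and cosine pair so that the end of the year wraps around to the start. The sketch below assumes `Month` is coded 1 to 12 and uses simulated data with a hypothetical continuous outcome.

```r
# Simulated stand-in for Data.clean (hypothetical values, NOT study data).
set.seed(3)
toy <- data.frame(Month = rep(1:12, each = 5))
toy$gose <- 6 + cos(2 * pi * toy$Month / 12) + rnorm(nrow(toy))

# Encode month as a point on a circle so that month 12 sits next to month 1.
toy$month_sin <- sin(2 * pi * toy$Month / 12)
toy$month_cos <- cos(2 * pi * toy$Month / 12)

linear_fit   <- lm(gose ~ Month, data = toy)                  # form used above
seasonal_fit <- lm(gose ~ month_sin + month_cos, data = toy)  # cyclical alternative
null_fit     <- lm(gose ~ 1, data = toy)

# Joint F-test of both harmonic terms against the intercept-only model.
seasonal_test <- anova(null_fit, seasonal_fit)
seasonal_p <- seasonal_test[["Pr(>F)"]][2]
print(seasonal_test)
```

The joint test asks whether outcome varies with season at all, without assuming that it rises or falls steadily from January to December.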

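Age, race, and time of year did not show strong or consistent relationships with GOSE or RPQ13. Time of year was associated with GOSE only at 2wk-1mo (p=0.012), and that association was absent at 3 and 6 months. The only factor that showed a consistent pattern was years of education. GOSE score improves with years of education: education is a significant predictor of GOSE at 2wk-1mo (p=0.005; p=0.014 after adjusting for age and GCS) and at 3 months (p=0.032; p=0.052 after adjustment). The 6-month relationship is not significant (p=0.36), which may be due to the small sample size (15 subjects with complete data). RPQ13, which measures symptoms, decreases with years of education: there is a trend at 2wk-1mo (p=0.085), a significant effect at 3 months (p=0.004, which holds after adjusting for age and GCS), and a near-significant effect at 6 months (p=0.055). Together, these results suggest that years of education may be protective against chronic symptoms after TBI.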
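
Because education was tested against two outcomes at three time points, the six unadjusted p-values can be viewed together with a multiple-comparison adjustment. The sketch below applies the Holm method through `p.adjust`. The choice of Holm and the grouping of these six tests into one family are assumptions made for illustration and were not part of the original analysis; the p-values are copied from the single-predictor model summaries above.

```r
# Unadjusted p-values for the education_years slope, copied from the
# single-predictor model summaries in this section.
p_raw <- c(
  gose_1mo  = 0.004714,
  gose_3mo  = 0.03164,
  gose_6mo  = 0.3595,
  rpq13_1mo = 0.08464,
  rpq13_3mo = 0.004004,
  rpq13_6mo = 0.05464
)

# Holm step-down adjustment across the six tests (an illustrative choice).
p_holm <- p.adjust(p_raw, method = "holm")

print(round(cbind(raw = p_raw, holm = p_holm), 4))
```

Under this adjustment the 2wk-1mo GOSE and 3-month RPQ13 slopes remain below 0.05, while the other four do not.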

Conclusions

Given the heterogeneity of TBI, the sample size in this study is small. With 79 subjects, only 16 of whom were HCT-, the results of this study are not conclusive. However, the data suggest that a HCT+ finding is not an informative inclusion criterion for brain injury studies and should not be utilized going forward. Many of the current studies in our lab have already eliminated that criterion; however, we have collaborators who still require it. Years of education appears to be predictive of improved outcome after a brain injury. Several papers have shown a similar finding, suggesting a potential cognitive reserve for TBI similar to what is seen in Alzheimer’s disease patients.

A large part of this data set is being collected for a multi-center study called TRACK-TBI. There are currently over 15 institutions participating in TRACK-TBI, collecting similar data in the hopes of defining a new taxonomy for TBI. The data is available to any interested party; however, it can take up to a year to receive the TRACK-TBI data. The subject pool contains thousands of subjects, and our lab is currently in the process of requesting that data for other projects. These analyses can be extended to more subjects in the future by using the TRACK-TBI dataset.